Inference-based Decision Making in Games

Authors

  • Tim Rakowski
  • Marc Toussaint
Abstract

Background: Reinforcement learning in complex games has traditionally been the domain of value or policy iteration algorithms, owing to their effectiveness in planning in Markov decision processes. Later, algorithms based on regret minimization guarantees, such as upper confidence bounds applied to trees (UCT) and counterfactual regret minimization, were developed and proved to be very successful as well. Meanwhile, remarkably simple algorithms based on likelihood maximization were found for planning in Markov decision processes, which opened up room for new research. Applying these new methods to extensive games is the focus of this thesis.

Results: We describe a generic schema for transforming an extensive game into a multi-agent partially observable Markov decision process (POMDP), derive a strategy update based on the EM algorithm, and give an implementation using the hidden Markov model. Tests on a number of minimalistic games suggest that, in the two-player case, equilibrium strategies are found if the game has pure Nash equilibria; otherwise, only the average payoffs of the two players converge to their respective values in a mixed Nash equilibrium, i.e. no equilibrium strategies are found. Further investigation showed that the algorithmic framework is general enough to allow the M-step to be replaced by other update procedures, such as the polynomial weights algorithm (resulting in external regret minimization) or the counterfactual regret minimization method. Using the latter update, the strategies do converge.

Statutory Declaration: I hereby declare in lieu of oath that this thesis was written by no one other than myself. All aids used, such as reports, books, websites, or the like, are listed in the bibliography, and quotations from other works are marked as such. This thesis has not previously been submitted in the same or a similar form to any other examination board, nor has it been published. Berlin, June 7, 2011
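
As a rough illustration of the polynomial weights update named in the abstract, the following sketch runs it in self-play on matching pennies, a minimal two-player zero-sum game whose only Nash equilibrium is mixed. It is not the thesis implementation: the function name, the step size `eta`, the number of steps, the starting weights, and the use of expected rather than sampled losses are all assumptions made for this example. It returns the time-averaged strategies, which for no-regret updates in zero-sum games should end up close to the uniform equilibrium.

```python
def polynomial_weights_self_play(steps=10000, eta=0.05):
    """Hypothetical helper: polynomial weights in self-play on matching pennies."""
    # Row player's payoff; the column player receives the negative (zero-sum).
    payoff = [[1.0, -1.0], [-1.0, 1.0]]
    # Assumed starting weights; the row player starts off-equilibrium on purpose.
    w_row, w_col = [0.75, 0.25], [0.5, 0.5]
    sum_row, sum_col = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        # Current mixed strategies are the normalized weights.
        p = [w / sum(w_row) for w in w_row]
        q = [w / sum(w_col) for w in w_col]
        for a in range(2):
            sum_row[a] += p[a]
            sum_col[a] += q[a]
        # Expected payoff of each pure action against the opponent's current mix.
        u_row = [sum(payoff[a][b] * q[b] for b in range(2)) for a in range(2)]
        u_col = [sum(-payoff[a][b] * p[a] for a in range(2)) for b in range(2)]
        # Rescale payoffs in [-1, 1] to losses in [0, 1], then apply
        # the polynomial weights update w <- w * (1 - eta * loss).
        w_row = [p[a] * (1 - eta * (1 - u_row[a]) / 2) for a in range(2)]
        w_col = [q[b] * (1 - eta * (1 - u_col[b]) / 2) for b in range(2)]
    # Time-averaged strategies of both players.
    return [s / steps for s in sum_row], [s / steps for s in sum_col]


avg_row, avg_col = polynomial_weights_self_play()
print(avg_row, avg_col)
```

The sketch reports only the averaged strategies. It makes no claim about whether the strategies played in individual rounds settle down.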

Similar resources

A history of evidence in medical decisions: from the diagnostic sign to Bayesian inference.

Bayesian inference in medical decision making is a concept that has a long history with 3 essential developments: 1) the recognition of the need for data (demonstrable scientific evidence), 2) the development of probability, and 3) the development of inverse probability. Beginning with the demonstrative evidence of the physician's sign, continuing through the development of probability theory b...

Cost Analysis of Acceptance Sampling Models Using Dynamic Programming and Bayesian Inference Considering Inspection Errors

Acceptance sampling models have been widely applied in companies for inspecting and testing raw materials as well as final products. Many lots of items are produced each day in industry, so it may be impossible to inspect or test each item in a lot. The acceptance sampling models only provide the guarantee for the producer and consumer that the items in the lots are acco...

Pattern of Decision-Making Evaluation in Urban Regeneration

Evaluation is one of the most important necessities in the process of urban regeneration, which leads to the optimal decision to solve the problem of urban inefficiency. Evaluating decisions in the urban regeneration process, given its complexity, ambiguity and uncertainty, is a critical issue that requires identifying the criteria that affect its realization and using fuzzy decision-making met...

An integrated Decision-Making Approach for Road Transport Evaluation in a Sustainable Supply Chain

One important step to achieve a sustainable transportation system is to control the impact and evaluate the effect of various influencing factors toward three dimensions of sustainability. Within this context, diverse analytical approaches have been developed to assess the sustainability level of various transportation systems, however, sustainability evaluation based on fuzzy multiple criteria...

Investment Decision-Making about Portfolio of Technology Development Projects; Based on the Analysis of Success Criteria using Fuzzy Neural Network and MADM

A technology development project is a type of investment project, and it is important to identify its performance indicators and plan for the correct investment. The purpose of this research is to develop indicators of portfolio success, analyze accurately the effects of the indicators on each other, and arrive at a proper investment model. In this research, the success criteria ...

Thyroid disorder diagnosis based on Mamdani fuzzy inference system classifier

Introduction: Classification and prediction are two of the most important applications of statistical methods in the field of medicine. Classical classification is based on clinical symptoms and does not involve the use of specialized information and knowledge. Therefore, a classifier that can combine all this information is necessary. The aim of this s...

Publication date: 2011